Claim Amendments 



1.-46. (Canceled) 

47. (Previously Presented) A method, comprising: 
via a first participant equipment: 

detecting an acoustic signal; 

determining whether the detected acoustic signal was generated by a 
person speaking by receiving a frame of audio data derived from the detected 
acoustic signal; 

classifying the received frame based on spectral data of the received 
frame, the spectral data obtained by performing a modulated complex lapped 
transform (MCLT) on the frame of audio data, the classifying comprising 
classifying the received frame as one of the plurality of predetermined frame 
types comprising a live-type frame, a phone-type frame, and an unsure-type 
frame, wherein live-type frames represent frames determined to be derived from 
acoustic signals generated by a person speaking, and phone-type frames 
represent frames determined to be derived from acoustic signals generated by 
an audio transducer device; and 

providing a signal indicating to a second participant equipment that the 
detected acoustic signal was generated by the person. 

48. (Original) The method of claim 47, wherein determining whether the 
detected acoustic signal was generated by a person further comprises: 

determining whether the detected acoustic signal was speech from an audio
transducer device. 

49. (Original) The method of claim 47, wherein the signal is transmitted over a 
network to participants of a multi-party conference. 

50. (Previously Presented) The method of claim 47 wherein determining 
whether the detected acoustic signal was generated by a person further comprises: 

determining a source of a portion of the detected acoustic signal used to derive 
the frame based on the classification of the frame and a prior determination of a source 
of a detected acoustic signal portion. 

51. (Original) The method of claim 50, wherein the spectral data is obtained by
performing a frequency transform on the frame of audio data. 

52. (Original) The method of claim 51, wherein the spectral data is obtained by 
performing a fast Fourier transform (FFT) on the frame of audio data. 

53. (Previously Canceled) 

54. (Original) The method of claim 50, further comprising determining a first 
frequency band's energy and a second frequency band's energy from the spectral data. 
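
For illustration only, and not part of the claims: a minimal sketch of how the two band energies of claim 54 might be computed from spectral data. It uses a naive discrete Fourier transform as a stand-in for the FFT of claim 52 or the MCLT of claim 47. The function name, the sample rate, and the band edges are hypothetical, and the sketch assumes the first (consonant) band lies above the second (vowel) band.

```python
import cmath
import math

def band_energies(frame, sample_rate, first_band, second_band):
    """Return (first_energy, second_energy) for one frame of audio samples.

    first_band and second_band are (f_min_hz, f_max_hz) tuples. Their
    values are assumptions for illustration, not taken from the claims.
    """
    n = len(frame)
    first = second = 0.0
    # Bins up to the Nyquist frequency cover the whole spectrum of real input.
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        power = abs(coeff) ** 2
        freq = k * sample_rate / n
        if first_band[0] <= freq < first_band[1]:
            first += power
        elif second_band[0] <= freq < second_band[1]:
            second += power
    return first, second

# Usage: a pure 500 Hz tone sampled at 8 kHz falls entirely in the lower band.
frame = [math.sin(2 * math.pi * 500 * t / 8000) for t in range(64)]
first, second = band_energies(frame, 8000, (2000, 4000), (100, 2000))
```

A production implementation would normally use an FFT routine rather than this quadratic-time transform; the naive form is used here only to keep the sketch self-contained.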



Serial No.: 10/677,213 

Atty Docket No.: MS1-1676US 

Atty/Agent: Jason D. Mehigan 




www.leehayes.com • 509.324.9256 



55. (Original) The method of claim 54, wherein the first frequency band 
corresponds to a frequency range for consonants and the second frequency band 
corresponds to a frequency range for vowels. 



56. (Original) The method of claim 55, further comprising classifying the frame 
as being generated by a person when the ratio of the energies of the first and second 
frequency bands exceeds a first predetermined threshold. 

57. (Original) The method of claim 56, further comprising selectively classifying 
the frame as being from a different source when the ratio of the energies of the first and 
second frequency bands is below a second predetermined threshold. 

58. (Original) The method of claim 57, further comprising selectively classifying 
the frame as having an unknown source when the ratio of the energies of the first and 
second frequency bands exceeds the second predetermined threshold and is below the 
first predetermined threshold, the second predetermined threshold being less than the 
first predetermined threshold. 
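
For illustration only, and not part of the claims: the two-threshold classification of claims 56 through 58 can be sketched as follows. The function name, the label strings, and the guard against a zero denominator are assumptions. The claims do not say how a ratio exactly equal to a threshold is handled; this sketch treats it as unknown.

```python
def classify_frame(first_energy, second_energy, first_threshold, second_threshold):
    """Classify one frame from the ratio of its two band energies.

    Returns "live", "phone", or "unsure" (hypothetical labels).
    """
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be less than the first")
    # Guard against division by zero; the epsilon is an assumption.
    ratio = first_energy / max(second_energy, 1e-12)
    if ratio > first_threshold:
        return "live"    # claim 56: generated by a person
    if ratio < second_threshold:
        return "phone"   # claim 57: a different source
    return "unsure"      # claim 58: between the two thresholds

# Usage with made-up thresholds of 4.0 and 2.0.
labels = [classify_frame(e, 1.0, 4.0, 2.0) for e in (8.0, 3.0, 1.0)]
```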

59. (Original) The method of claim 55, further comprising determining whether 
the frame was derived from speech, wherein speech includes acoustic signals 
generated by a person speaking and acoustic signals generated by an audio transducer 
device. 

60. (Original) The method of claim 59, further comprising determining noise
floor energies of the first and second frequency bands using the spectral data, wherein 
a frame is selectively classified as being derived from speech in response to the energy 
of the first frequency band exceeding the noise floor energy of the first frequency band 
or the energy of the second frequency band exceeding the noise floor of the second 
frequency band, or both. 
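
For illustration only, and not part of the claims: the speech test of claim 60 compares each band's energy with that band's noise floor. The claims do not state how the noise floor is estimated, so the slow-rising minimum tracker below is purely an assumption, as are all names and the rise factor.

```python
def is_speech(first_energy, second_energy, first_floor, second_floor):
    """True when either band's energy exceeds its noise floor, or both do."""
    return first_energy > first_floor or second_energy > second_floor

def update_noise_floor(floor, energy, rise=1.05):
    """Hypothetical estimator: drop to quieter frames at once, rise slowly."""
    return min(energy, floor * rise)

# Usage with made-up energies and floors.
quiet = is_speech(0.5, 0.8, 1.0, 1.0)
loud = is_speech(0.5, 3.0, 1.0, 1.0)
```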

61. (Original) The method of claim 60, wherein classifying the received frame 
further comprises: 

determining whether a frame received within a predetermined number of frames 
relative to the received frame has substantially all of its energy in the second frequency 
band. 

62. (Original) The method of claim 60, wherein classifying the received frame 
further comprises: 

determining whether a frame adjacent to the received frame was classified as 
derived from speech. 

63. (Previously Canceled)

64. (Previously Presented) The method of claim 47, further comprising 
determining the source of the detected acoustic signal to be an acoustic signal 
generated by a person, if: 

the prior determination of a source of a detected acoustic signal portion is that
the source was an audio transducer device; 

the frame is classified as a live-type frame; and 

a predetermined number of prior frames includes live-type frames that exceed a 
predetermined live-type frame count threshold. 

65. (Previously Presented) The method of claim 47, further comprising 
determining the source of the detected acoustic signal to be unsure, if: 

the prior determination of a source of a detected acoustic signal portion is that 
the source was an audio transducer device; 

the frame is classified as a live-type frame; 

a predetermined number of most recent frames do not include enough live-type 
frames to exceed a predetermined threshold; and 

an elapsed time since receiving a previous frame derived from speech exceeds a 
predetermined first time threshold. 

66. (Previously Presented) The method of claim 47, further comprising 
determining the source of the detected acoustic signal to be an audio transducer device, 
if: 

the prior determination of a source of a detected acoustic signal portion is that 
the source was an acoustic signal generated by a person speaking; 
the frame is classified as a phone-type frame; 

an elapsed time since receiving a previous live-type frame exceeds a
predetermined second time threshold; and 

a counter value does not exceed a predetermined count threshold, the counter 
value to track a number of consecutive non-live-type frames received after receiving a 
live-type frame of most recent frames do not include enough live-type frames to exceed 
a predetermined threshold. 

67. (Previously Presented) The method of claim 47, further comprising 
determining the source of the detected acoustic signal to be unsure, if: 

the prior determination of a source of a detected acoustic signal portion is that 
the source was an acoustic signal generated by a person speaking; 
the frame is classified as a phone-type frame; 

an elapsed time since receiving a previous live-type frame exceeds a 
predetermined second time threshold; and 

the counter value is below a predetermined count threshold, the counter value to 
track a number of consecutive non-live-type frames received after receiving a live-type 
frame of most recent frames do not include enough live-type frames to exceed a 
predetermined threshold. 

68. (Previously Presented) The method of claim 47, further comprising 
determining the source of the detected acoustic signal to be an acoustic signal 
generated by a person speaking, if: 

the prior determination of a source of a detected acoustic signal portion is
unsure; 

the frame is classified as a live-type frame; and 

a predetermined number of most recent prior frames includes live-type frames 
that exceed in number a predetermined live-type frame count threshold. 

69. (Previously Presented) The method of claim 47, further comprising 
determining the source of the detected acoustic signal to be an acoustic transducer 
device, if: 

the prior determination of a source of a detected acoustic signal portion is 
unsure; 

the frame is classified as a phone-type frame; and 

a predetermined number of most recent prior frames includes phone-type frames 
that exceed in number a predetermined phone-type frame count threshold. 
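
For illustration only, and not part of the claims: the source transitions of claims 64, 65, 68, and 69 can be read as a small state machine. The sketch below covers only those four claims; the transitions out of the person-speaking state (claims 66 and 67) are left out. The function name, label strings, and parameter names are hypothetical, and the sketch assumes the source is left unchanged when no listed condition is met.

```python
def next_source(prior, frame_type, recent_types,
                live_count_threshold, phone_count_threshold,
                seconds_since_speech, first_time_threshold):
    """Return the new source: "live", "phone", or "unsure"."""
    live_count = sum(1 for t in recent_types if t == "live")
    phone_count = sum(1 for t in recent_types if t == "phone")
    if prior == "phone" and frame_type == "live":
        if live_count > live_count_threshold:
            return "live"      # claim 64
        if seconds_since_speech > first_time_threshold:
            return "unsure"    # claim 65
    elif prior == "unsure":
        if frame_type == "live" and live_count > live_count_threshold:
            return "live"      # claim 68
        if frame_type == "phone" and phone_count > phone_count_threshold:
            return "phone"     # claim 69
    return prior               # assumption: otherwise keep the prior source

# Usage with made-up thresholds and a five-frame history.
result = next_source("phone", "live", ["live"] * 5, 3, 3, 0.0, 1.0)
```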

70. (Previously Presented) A computer-readable tangible medium having 
computer-executable instructions that, upon execution, facilitate a computing device in 
performing operations comprising: 

detecting an acoustic signal; 

determining whether the detected acoustic signal was generated by a person 
speaking by receiving a frame of audio data derived from the detected acoustic signal; 
determining a source of the detected acoustic signal to be unsure, if: 

a prior determination of the source of the detected acoustic signal is that
the source of the detected acoustic signal was an audio transducer device; 

the frame of audio data is classified as a live-type frame; 

a predetermined number of most recent frames do not include enough 
live-type frames to exceed a predetermined live-type frame count threshold; and 

an elapsed time since receiving a previous frame derived from speech 
exceeds a predetermined first time threshold; or 

the prior determination of the source of the detected acoustic signal is that 
the source was the acoustic signal generated by a person speaking; 

the frame of audio data is classified as a phone-type frame; 

an elapsed time since receiving a previous live-type frame exceeds a 
predetermined second time threshold; and 

a counter value is below a predetermined count threshold, the counter 
value to track a number of consecutive non-live-type frames received after 
receiving the live-type frame of most recent frames does not include enough live- 
type frames to exceed the predetermined count threshold; 
determining the source of the detected acoustic signal to be the acoustic signal 
generated by the person speaking, if: 

the prior determination of a source of a detected acoustic signal is unsure; 

the frame of audio data is classified as the live-type frame; and 

the predetermined number of most recent prior frames includes live-type 
frames that exceed in number the predetermined live-type frame count threshold; 

determining the source of the detected acoustic signal to be the audio transducer
device, if: 

the prior determination of the source of the detected acoustic signal is 
unsure; 

the frame of audio data is classified as the phone-type frame; and 
the predetermined number of most recent prior frames includes phone- 
type frames that exceed in number a predetermined phone-type frame count 
threshold; 

classifying the received frame of audio data based on spectral data of the 
received frame of audio data, the classifying comprising classifying the received frame 
of audio data as one of a plurality of predetermined frame types comprising the live-type 
frame; the phone-type frame; and an unsure-type frame, wherein live-type frames 
represent frames determined to be derived from acoustic signals generated by the 
person speaking, and phone-type frames represent frames determined to be derived 
from acoustic signals generated by the audio transducer device, wherein parameters 
used to classify frames include high band noise floor energy, low band noise floor 
energy, frame high band energy, frame low band energy, a ratio of the frame high band 
energy to the frame low band energy; 

classifying the received frame of audio data from non-spectral data of the 
received frame based on a parameters energy ratio threshold for live speech and an 
energy ratio for phone speech; and 

providing a signal indicating to a second computing device that the detected 
acoustic signal was generated by a person. 

71. (Previously Presented) The computer-readable tangible medium of claim
70, wherein determining whether the detected acoustic signal was generated by a 
person further comprises: 

determining whether the detected acoustic signal was speech from an audio 
transducer device. 

72. (Previously Presented) The computer-readable tangible medium of claim 
70, wherein the signal is transmitted over a network to participants of a multi-party 
conference. 

73. (Previously Presented) The computer-readable tangible medium of claim 
70, wherein determining whether the detected acoustic signal was generated by a 
person further comprises: 

determining a source of a portion of the detected acoustic signal used to derive 
the frame based on the classification of the frame and a prior determination of a source 
of a detected acoustic signal portion. 

74. (Previously Presented) The computer-readable tangible medium of claim 
73, wherein the spectral data is obtained by performing a frequency transform on the 
frame of audio data. 

75. (Previously Presented) The computer-readable tangible medium of claim
73, wherein the operations further comprise determining a first frequency band's energy 
and a second frequency band's energy from the spectral data. 

76. (Previously Presented) The computer-readable tangible medium of claim
75, wherein the first frequency band corresponds to a frequency range for consonants
and the second frequency band corresponds to a frequency range for vowels. 

77. (Previously Presented) The computer-readable tangible medium of claim
76, wherein the operations further comprise:

classifying the frame as being generated by a person when the ratio of the 
energies of the first and second frequency bands exceeds a first predetermined 
threshold. 

78. (Previously Presented) The computer-readable tangible medium of claim
77, wherein the operations further comprise:

selectively classifying the frame as being from another source when the ratio of 
the energies of the first and second frequency bands is below a second predetermined 
threshold. 

79. (Previously Presented) The computer-readable tangible medium of claim
78, wherein the operations further comprise:

selectively classifying the frame as having an unknown source when the ratio of
the energies of the first and second frequency bands exceeds the second 
predetermined threshold and is below the first predetermined threshold, the second 
predetermined threshold being less than the first predetermined threshold. 

80. (Previously Presented) The computer-readable tangible medium of claim 
76, wherein the operations further comprise: 

determining whether the frame was derived from speech, wherein speech 
includes acoustic signals generated by a person speaking and acoustic signals 
generated by an audio transducer device. 

81. (Previously Presented) The computer-readable tangible medium of claim
80, wherein the operations further comprise:

determining noise floor energies of the first and second frequency bands using 
the spectral data, wherein a frame is selectively classified as being derived from speech 
in response to the energy of the first frequency band exceeding the noise floor energy of 
the first frequency band or the energy of the second frequency band exceeding the 
noise floor of the second frequency band, or both. 

82. (Previously Presented) The computer-readable tangible medium of claim
81, wherein classifying the received frame further comprises:

determining whether a frame adjacent to the received frame was classified as 
derived from speech. 

83. (Previously Presented) The computer-readable tangible medium of claim
81, wherein classifying the received frame further comprises:

determining whether a frame within a predetermined number of frames relative to 
the received frame has substantially all of its energy in the second frequency band. 

84. (Previously Canceled) 

85. (Previously Presented) The computer-readable tangible medium of claim 
70, wherein the operations further comprise determining the source of the detected 
acoustic signal to be an acoustic signal generated by a person, if: 

the prior determination of a source of a detected acoustic signal portion is that the 
source was an audio transducer device; 

the frame is classified as a live-type frame; and 

a predetermined number of prior frames includes live-type frames that exceed a 
predetermined live-type frame count threshold. 

86. (Previously Canceled) 

87. (Previously Presented) The computer-readable tangible medium of claim 
70, wherein the operations further comprise determining the source of the detected 
acoustic signal to be an audio transducer device, if: 

the prior determination of a source of a detected acoustic signal portion is that
the source was an acoustic signal generated by a person speaking; 
the frame is classified as a phone-type frame; 

an elapsed time since receiving a previous live-type frame exceeds a 
predetermined second time threshold; and 

a counter value does not exceed a predetermined count threshold, the counter 
value to track a number of consecutive non-live-type frames received after receiving a 
live-type frame of most recent frames do not include enough live-type frames to exceed 
a predetermined threshold. 

88.-95. (Canceled) 